NSF PAR Search | NSF Public Access Repository

Ensemble Kalman methods: A mean-field perspective

https://doi.org/10.1017/S0962492924000060

Calvello, Edoardo; Reich, Sebastian; Stuart, Andrew M (July 2025, Acta Numerica)

Ensemble Kalman methods, introduced in 1994 in the context of ocean state estimation, are now widely used for state estimation and parameter estimation (inverse problems) in many arenae. Their success stems from the fact that they take an underlying computational model as a black box to provide a systematic, derivative-free methodology for incorporating observations; furthermore the ensemble approach allows for sensitivities and uncertainties to be calculated. Analysis of the accuracy of ensemble Kalman methods, especially in terms of uncertainty quantification, is lagging behind empirical success; this paper provides a unifying mean-field-based framework for their analysis. Both state estimation and parameter estimation problems are considered, and formulations in both discrete and continuous time are employed. For state estimation problems, both the control and filtering approaches are considered; analogously for parameter estimation problems, the optimization and Bayesian perspectives are both studied. As well as providing an elegant framework, the mean-field perspective also allows for the derivation of a variety of methods used in practice. In addition it unifies a wide-ranging literature in the field and suggests open problems.
Free, publicly-accessible full text available July 1, 2026
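
The derivative-free idea in the abstract above can be illustrated with a minimal sketch. It is an assumption-laden toy, not one of the algorithms analysed in the paper: the unknown is a scalar, no observation perturbations are used, and the names `ensemble_kalman_step`, `forward`, and `noise_var` are hypothetical. The forward model is only ever evaluated, never differentiated; empirical covariances across the ensemble stand in for derivatives.

```python
from statistics import fmean


def ensemble_kalman_step(ensemble, forward, y, noise_var):
    """One ensemble Kalman update for a scalar unknown (illustrative only)."""
    outputs = [forward(u) for u in ensemble]  # black-box model evaluations
    u_bar, g_bar = fmean(ensemble), fmean(outputs)
    n = len(ensemble)
    # Empirical cross-covariance and output variance replace derivatives.
    c_ug = sum((u - u_bar) * (g - g_bar) for u, g in zip(ensemble, outputs)) / (n - 1)
    c_gg = sum((g - g_bar) ** 2 for g in outputs) / (n - 1)
    gain = c_ug / (c_gg + noise_var)
    # Move every member towards agreement with the observation y.
    return [u + gain * (y - g) for u, g in zip(ensemble, outputs)]


def forward(u):
    """Hypothetical forward model; the data y = 5 corresponds to u = 2."""
    return 2.0 * u + 1.0


# Deterministic initial ensemble: 21 members evenly spaced on [-2, 2].
ensemble = [-2.0 + 0.2 * j for j in range(21)]
for _ in range(5):
    ensemble = ensemble_kalman_step(ensemble, forward, y=5.0, noise_var=0.01)

estimate = fmean(ensemble)
spread = max(ensemble) - min(ensemble)
```

In this linear toy problem the ensemble mean moves close to the exact solution u = 2 and the ensemble spread collapses. Variants that perturb the observations retain spread and are the ones typically used when uncertainty estimates are wanted.
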
Second order ensemble Langevin method for sampling and inverse problems

https://doi.org/10.4310/CMS.250517001811

Liu, Ziming; Stuart, Andrew; Wang, Yixuan (January 2025, Communications in Mathematical Sciences)

We propose a sampling method based on an ensemble approximation of second order Langevin dynamics. The log target density is appended with a quadratic term in an auxiliary momentum variable and damped-driven Hamiltonian dynamics introduced; the resulting stochastic differential equation is invariant to the Gibbs measure, with marginal on the position coordinates given by the target. A preconditioner based on covariance under the law of position coordinates under the dynamics does not change this invariance property, and is introduced to accelerate convergence to the Gibbs measure. The resulting mean-field dynamics may be approximated by an ensemble method; this results in a gradient-free and affine-invariant stochastic dynamical system with desirable provably uniform convergence properties across the class of all Gaussian targets. Numerical results demonstrate the potential of the method as the basis for a numerical sampler in Bayesian inverse problems, beyond the Gaussian setting.
Full Text Available
Efficient, multimodal, and derivative-free Bayesian inference with Fisher–Rao gradient flows

https://doi.org/10.1088/1361-6420/ad847b

Chen, Yifan; Huang, Daniel Zhengyu; Huang, Jiaoyang; Reich, Sebastian; Stuart, Andrew M (October 2024, Inverse Problems)

In this paper, we study efficient approximate sampling for probability distributions known up to normalization constants. We specifically focus on a problem class arising in Bayesian inference for large-scale inverse problems in science and engineering applications. The computational challenges we address with the proposed methodology are: (i) the need for repeated evaluations of expensive forward models; (ii) the potential existence of multiple modes; and (iii) the fact that gradient of, or adjoint solver for, the forward model might not be feasible. While existing Bayesian inference methods meet some of these challenges individually, we propose a framework that tackles all three systematically. Our approach builds upon the Fisher–Rao gradient flow in probability space, yielding a dynamical system for probability densities that converges towards the target distribution at a uniform exponential rate. This rapid convergence is advantageous for the computational burden outlined in (i). We apply Gaussian mixture approximations with operator splitting techniques to simulate the flow numerically; the resulting approximation can capture multiple modes thus addressing (ii). Furthermore, we employ the Kalman methodology to facilitate a derivative-free update of these Gaussian components and their respective weights, addressing the issue in (iii). The proposed methodology results in an efficient derivative-free posterior approximation method, flexible enough to handle multi-modal distributions: Gaussian Mixture Kalman Inversion (GMKI). The effectiveness of GMKI is demonstrated both theoretically and numerically in several experiments with multimodal target distributions, including proof-of-concept and two-dimensional examples, as well as a large-scale application: recovering the Navier–Stokes initial condition from solution data at positive times.
Full Text Available
Operator Learning Using Random Features: A Tool for Scientific Computing

https://doi.org/10.1137/24M1648703

Nelsen, Nicholas H; Stuart, Andrew M (August 2024, SIAM Review)

Supervised operator learning centers on the use of training data, in the form of input-output pairs, to estimate maps between infinite-dimensional spaces. It is emerging as a powerful tool to complement traditional scientific computing, which may often be framed in terms of operators mapping between spaces of functions. Building on the classical random features methodology for scalar regression, this paper introduces the function-valued random features method. This leads to a supervised operator learning architecture that is practical for nonlinear problems yet is structured enough to facilitate efficient training through the optimization of a convex, quadratic cost. Due to the quadratic structure, the trained model is equipped with convergence guarantees and error and complexity bounds, properties that are not readily available for most other operator learning architectures. At its core, the proposed approach builds a linear combination of random operators. This turns out to be a low-rank approximation of an operator-valued kernel ridge regression algorithm, and hence the method also has strong connections to Gaussian process regression. The paper designs function-valued random features that are tailored to the structure of two nonlinear operator learning benchmark problems arising from parametric partial differential equations. Numerical results demonstrate the scalability, discretization invariance, and transferability of the function-valued random features method.
Full Text Available
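
The classical scalar random features method that the abstract above builds on can be sketched as follows. This is a simplified illustration, not the function-valued method of the paper: the cosine features, the Gaussian frequency distribution, the ridge parameter, and the names `fit_random_features` and `solve` are all assumptions made for this sketch. The feature parameters are drawn at random and left untrained; only the linear coefficients are fitted, by minimizing a convex quadratic cost.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_random_features(xs, ys, num_features=20, ridge=1e-3, seed=0):
    """Ridge regression on random cosine features (illustrative only)."""
    rng = random.Random(seed)
    # Random, untrained feature parameters: (frequency, phase) pairs.
    params = [(rng.gauss(0.0, 1.0), rng.uniform(0.0, 2.0 * math.pi))
              for _ in range(num_features)]

    def features(x):
        return [math.cos(w * x + b) for w, b in params]

    phi = [features(x) for x in xs]
    # Normal equations of the quadratic cost |Phi c - y|^2 + ridge |c|^2:
    # (Phi^T Phi + ridge I) c = Phi^T y
    gram = [[sum(row[i] * row[j] for row in phi) + (ridge if i == j else 0.0)
             for j in range(num_features)] for i in range(num_features)]
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(num_features)]
    coeffs = solve(gram, rhs)
    return lambda x: sum(c * f for c, f in zip(coeffs, features(x)))


# Toy data: samples of sin(x) on [0, 2*pi].
xs = [2.0 * math.pi * i / 39 for i in range(40)]
ys = [math.sin(x) for x in xs]
model = fit_random_features(xs, ys)
mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
baseline = sum(y ** 2 for y in ys) / len(ys)  # error of the zero predictor
```

Because the cost is quadratic in the coefficients, training reduces to a single linear solve, which is the structural property the abstract credits for the method's guarantees.
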
A framework for machine learning of model error in dynamical systems

https://doi.org/10.1090/cams/10

Levine, Matthew; Stuart, Andrew (October 2022, Communications of the American Mathematical Society)

The development of data-informed predictive models for dynamical systems is of widespread interest in many disciplines. We present a unifying framework for blending mechanistic and machine-learning approaches to identify dynamical systems from noisily and partially observed data. We compare pure data-driven learning with hybrid models which incorporate imperfect domain knowledge, referring to the discrepancy between an assumed truth model and the imperfect mechanistic model as model error. Our formulation is agnostic to the chosen machine learning model, is presented in both continuous- and discrete-time settings, and is compatible both with model errors that exhibit substantial memory and errors that are memoryless. First, we study memoryless linear (w.r.t. parametric-dependence) model error from a learning theory perspective, defining excess risk and generalization error. For ergodic continuous-time systems, we prove that both excess risk and generalization error are bounded above by terms that diminish with the square-root of T, the time-interval over which training data is specified. Secondly, we study scenarios that benefit from modeling with memory, proving universal approximation theorems for two classes of continuous-time recurrent neural networks (RNNs): both can learn memory-dependent model error, assuming that it is governed by a finite-dimensional hidden variable and that, together, the observed and hidden variables form a continuous-time Markovian system. In addition, we connect one class of RNNs to reservoir computing, thereby relating learning of memory-dependent error to recent work on supervised learning between Banach spaces using random features. Numerical results are presented (Lorenz ’63, Lorenz ’96 Multiscale systems) to compare purely data-driven and hybrid approaches, finding hybrid methods less data-hungry and more parametrically efficient. We also find that, while a continuous-time framing allows for robustness to irregular sampling and desirable domain-interpretability, a discrete-time framing can provide similar or better predictive performance, especially when data are undersampled and the vector field defining the true dynamics cannot be identified. Finally, we demonstrate numerically how data assimilation can be leveraged to learn hidden dynamics from noisy, partially-observed data, and illustrate challenges in representing memory by this approach, and in the training of such models.
Full Text Available
Ensemble Kalman inversion for sparse learning of dynamical systems from time-averaged data

https://doi.org/10.1016/j.jcp.2022.111559

Schneider, Tapio; Stuart, Andrew M.; Wu, Jin-Long (December 2022, Journal of Computational Physics)

Full Text Available
Harnessing AI and computing to advance climate modelling and prediction

https://doi.org/10.1038/s41558-023-01769-3

Schneider, Tapio; Behera, Swadhin; Boccaletti, Giulio; Deser, Clara; Emanuel, Kerry; Ferrari, Raffaele; Leung, L. Ruby; Lin, Ning; Müller, Thomas; Navarra, Antonio; et al (September 2023, Nature Climate Change)

Full Text Available
Efficient derivative-free Bayesian inference for large-scale inverse problems

https://doi.org/10.1088/1361-6420/ac99fa

Huang, Daniel Zhengyu; Huang, Jiaoyang; Reich, Sebastian; Stuart, Andrew M (October 2022, Inverse Problems)

We consider Bayesian inference for large-scale inverse problems, where computational challenges arise from the need for repeated evaluations of an expensive forward model. This renders most Markov chain Monte Carlo approaches infeasible, since they typically require O(10^4) model runs, or more. Moreover, the forward model is often given as a black box or is impractical to differentiate. Therefore derivative-free algorithms are highly desirable. We propose a framework, which is built on Kalman methodology, to efficiently perform Bayesian inference in such inverse problems. The basic method is based on an approximation of the filtering distribution of a novel mean-field dynamical system, into which the inverse problem is embedded as an observation operator. Theoretical properties are established for linear inverse problems, demonstrating that the desired Bayesian posterior is given by the steady state of the law of the filtering distribution of the mean-field dynamical system, and proving exponential convergence to it. This suggests that, for nonlinear problems which are close to Gaussian, sequentially computing this law provides the basis for efficient iterative methods to approximate the Bayesian posterior. Ensemble methods are applied to obtain interacting particle system approximations of the filtering distribution of the mean-field model; and practical strategies to further reduce the computational and memory cost of the methodology are presented, including low-rank approximation and a bi-fidelity approach. The effectiveness of the framework is demonstrated in several numerical experiments, including proof-of-concept linear/nonlinear examples and two large-scale applications: learning of permeability parameters in subsurface flow; and learning subgrid-scale parameters in a global climate model. Moreover, the stochastic ensemble Kalman filter and various ensemble square-root Kalman filters are all employed and are compared numerically. The results demonstrate that the proposed method, based on exponential convergence to the filtering distribution of a mean-field dynamical system, is competitive with pre-existing Kalman-based methods for inverse problems.
Full Text Available
Iterated Kalman methodology for inverse problems

https://doi.org/10.1016/j.jcp.2022.111262

Huang, Daniel Zhengyu; Schneider, Tapio; Stuart, Andrew M. (August 2022, Journal of Computational Physics)

Full Text Available
Ensemble Inference Methods for Models With Noisy and Expensive Likelihoods

https://doi.org/10.1137/21M1410853

Dunbar, Oliver R.; Duncan, Andrew B.; Stuart, Andrew M.; Wolfram, Marie-Therese (June 2022, SIAM Journal on Applied Dynamical Systems)

Full Text Available
